Sharing Syntactic Structures
Abstract
Bracketed corpora are a very useful resource for natural language processing, but they are hard to build efficiently, so they are rarely large enough for practical use. Disparities in morphological information, such as word segmentation and part-of-speech tag sets, are also troublesome: an application built for one particular corpus often cannot be applied to another. In this paper, we sketch out a method to build a corpus that has a fixed syntactic structure but whose morphological annotation varies with the tag set scheme used. Our system uses a two-layered grammar: one layer is made up of replaceable tag-set-dependent rules, while the other has no tag set dependency. The input sentences of our system are bracketed according to the structural information of the corpus. The parser can work with any tag set and grammar, and by using the same input bracketing, we obtain corpora that share partial syntactic structure.
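The two-layer idea can be illustrated with a toy sketch. This is not the authors' system: the tag sets, the coarse categories, the rule table, and the names TAGSET_A, TAGSET_B, SHARED_RULES, annotate, and shape are all hypothetical, chosen only to show how one fixed bracketing might receive different morphological annotation while the phrase structure stays the same.

```python
# Layer 1 (replaceable): two hypothetical tag sets. Each maps words to tags
# and each tag to a coarse, tag-set-independent category.
TAGSET_A = {
    "lexicon": {"the": "DT", "dog": "NN", "barks": "VBZ"},
    "coarse": {"DT": "det", "NN": "noun", "VBZ": "verb"},
}
TAGSET_B = {
    "lexicon": {"the": "ART", "dog": "N", "barks": "V"},
    "coarse": {"ART": "det", "N": "noun", "V": "verb"},
}

# Layer 2 (shared): hypothetical rules that mention only coarse categories
# and phrase labels, never a concrete tag.
SHARED_RULES = {
    ("det", "noun"): "NP",
    ("verb",): "VP",
    ("NP", "VP"): "S",
}


def annotate(node, tagset):
    """Return (tree, category) for a bracketing node under the given tag set.

    A leaf becomes (tag, word); a bracket becomes (label, [child trees]).
    """
    if isinstance(node, str):
        tag = tagset["lexicon"][node]
        return (tag, node), tagset["coarse"][tag]
    results = [annotate(child, tagset) for child in node]
    trees = [tree for tree, _ in results]
    categories = tuple(category for _, category in results)
    label = SHARED_RULES.get(categories, "X")  # "X" marks an unlabeled bracket
    return (label, trees), label


def shape(tree):
    """Strip labels and tags, leaving only the nested bracketing."""
    _, body = tree
    if isinstance(body, str):
        return body
    return [shape(child) for child in body]


# The same input bracketing is annotated under both tag sets.
bracketing = [["the", "dog"], ["barks"]]
tree_a, _ = annotate(bracketing, TAGSET_A)
tree_b, _ = annotate(bracketing, TAGSET_B)
```

In this sketch the leaf tags of tree_a and tree_b differ, but shape returns the same bracketing for both, which is the sense in which the two annotations share syntactic structure.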
Similar resources
Syntactic Structures in Research Article Titles from Three Different Disciplines: Applied Linguistics, Civil Engineering, and Dentistry
Deducing what a paper is about, titles are considered as the most important determinant of how many people will read the article. Therefore, studying the use of different syntactic structures and their rhetorical functions in titles is of great significance. The current study was set to investigate these structures used in research article titles in three disciplines of Applied Linguistics, Den...
Syntactic Structures and Rhetorical Functions of Electrical Engineering, Psychiatry, and Linguistics Research Article Titles in English and Persian: A Cross-linguistic and Cross-disciplinary Study
A research article (RA) title is the first and foremost feature that attracts the reader's attention, the feature from which she/he may decide whether the whole article is worth reading. The present study attempted to investigate syntactic structures and rhetorical functions of RA titles written in English and Persian and published in journals in three disciplines of Electrical Engineering, Psy...
Comparison of the high-frequency morpho-syntactic structures of cochlear implant children and children with normal hearing aged 4-6 years
Introduction: Children with cochlear implants experience problems in all language domains, and have more problems with morpho-syntactic skills than with other domains. Considering the importance of morphology and syntax in the development of children's communication skills, this study compared the use of high-frequency morpho-syntactic structures among 4-6-year-old children with cochlear implants and ty...
BCCWJ-DepPara: A Syntactic Annotation Treebank on the 'Balanced Corpus of Contemporary Written Japanese'
Paratactic syntactic structures are difficult to represent in syntactic dependency tree structures. As such, we propose an annotation schema for syntactic dependency annotation of Japanese, in which coordinate structures are separated from and overlaid on bunsetsu(base phrase unit)-based dependency. The schema represents nested coordinate structures, non-constituent conjuncts, and forward shari...
Canonicity Effect on Sentence Processing of Persian-speaking Broca’s Patients
Introduction: Fundamental notions of mapping hypothesis and canonicity were scrutinized in Persian-speaking aphasics. Methods: To this end, the performance of four age-, education-, and gender-matched Persian-speaking Broca's patients and eight matched healthy controls on diverse complex structures was compared using two tasks of syntactic comprehension and grammaticality jud...
Applying a Semantic & Syntactic Comparisons Based Automatic Model Transformation Methodology to Serve Information Sharing
Information sharing, as an aspect of information and knowledge engineering, attracts more and more attention from researchers and practitioners. Since a large amount of cross-domain collaborations are appearing, exchanging and sharing information and knowledge among various domains are inevitable. However, due to the vast quantity and heterogeneous structures of information, it becomes impossib...
Publication date: 2008